Rerun ansible plays for failed hosts

 · Stefan Hellmann
Last updated: March 01, 2025

Sometimes I have to run an Ansible playbook on multiple servers, for example to install a piece of software on all of them. Normally the job succeeds on more than 90% of the servers, but I have to ensure that the software is installed on every one. So I have to rerun the playbook on all the failed servers.

One way is to run the playbook with the --limit parameter, copying and pasting all failed servers from the previous run. That's not very efficient, and there is a much better way to do this:

Ansible retry files

Ansible has an option to log all hosts that failed during a playbook run to a file. You can then rerun the playbook and pass this file as the limit parameter.

An example:

ANSIBLE_RETRY_FILES_ENABLED=True ANSIBLE_RETRY_FILES_SAVE_PATH=/tmp/ansible-retries ansible-playbook install_software.yml

If any hosts fail, this run creates a .retry file listing them, one host per line: /tmp/ansible-retries/install_software.retry

Then, after fixing some problems ;-), you can rerun the playbook with:

ansible-playbook install_software.yml --limit @/tmp/ansible-retries/install_software.retry

This runs the playbook only on the servers that failed.
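The two steps above can be wrapped in a small shell helper. This is only a sketch: the names RETRY_DIR, retry_file_for and rerun_failed are made up for illustration and are not part of Ansible. It assumes the retry file is named after the playbook's basename, as in the install_software example, and it only reruns when that file exists and is not empty.

```shell
#!/usr/bin/env bash
# Sketch only: RETRY_DIR, retry_file_for and rerun_failed are
# illustrative names, not something Ansible provides.
RETRY_DIR="${RETRY_DIR:-/tmp/ansible-retries}"

# Print the retry file path for a playbook, assuming the file is
# named after the playbook's basename without its extension.
retry_file_for() {
    name="$(basename "$1")"
    printf '%s/%s.retry\n' "$RETRY_DIR" "${name%.*}"
}

# Rerun the playbook on the hosts listed in its retry file,
# or say so if there is nothing to rerun.
rerun_failed() {
    playbook="$1"
    retry_file="$(retry_file_for "$playbook")"
    if [ -s "$retry_file" ]; then
        ANSIBLE_RETRY_FILES_ENABLED=True \
        ANSIBLE_RETRY_FILES_SAVE_PATH="$RETRY_DIR" \
        ansible-playbook "$playbook" --limit "@$retry_file"
    else
        echo "No retry file at $retry_file, nothing to rerun."
    fi
}
```

After a run with failures, calling rerun_failed install_software.yml would then replace the manual --limit command.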

BONUS

For simplicity, you can configure the retry file path in your Ansible configuration file and enable retry files only for the jobs that need them. To configure the path, put the following in your ~/.ansible.cfg:

[defaults]
retry_files_enabled = False
retry_files_save_path = /tmp/ansible-retries

I have created a bash alias to simply run ansible-playbook with retry files enabled:

alias ansible-playbook-retry="ANSIBLE_RETRY_FILES_ENABLED=True ansible-playbook"
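Bash does not expand aliases in non-interactive shells by default, so inside a script a function may be the more convenient form. The following is a minimal sketch of that idea. The function name ansible_playbook_retry and the ANSIBLE_PLAYBOOK_BIN override are assumptions made for this example, not Ansible features.

```shell
# Sketch only: function equivalent of the alias above.
# ANSIBLE_PLAYBOOK_BIN is a hypothetical override that lets you point
# the wrapper at another binary; it defaults to ansible-playbook.
ansible_playbook_retry() {
    ANSIBLE_RETRY_FILES_ENABLED=True \
    "${ANSIBLE_PLAYBOOK_BIN:-ansible-playbook}" "$@"
}
```

It is called like the alias, for example ansible_playbook_retry install_software.yml, and passes all arguments through unchanged.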